
Fix #1995: Use StrEnum for enum-like strings#2016

Merged
mdboom merged 11 commits into NVIDIA:main from mdboom:str-enums
May 5, 2026

Conversation

@mdboom
Contributor

@mdboom mdboom commented May 4, 2026

See discussion in #1995.

@mdboom mdboom added this to the cuda.core v1.0.0 milestone May 4, 2026
@mdboom mdboom requested a review from rwgk May 4, 2026 18:58
@mdboom mdboom self-assigned this May 4, 2026
@mdboom mdboom added the cuda.core (Everything related to the cuda.core module) and P1 (Medium priority - Should do) labels May 4, 2026

@rwgk
Contributor

rwgk commented May 4, 2026

PR 2016 Agent Review

Reviewer: Opus 4.7 1M (Max), 2026-05-04

Posted with a few small manual edits. I read through the entire review but only checked a few of the many details.


Context

This PR is the second half of the cuda.core v1.0.0 enum hardening that PR #2014 started. Where 2014 wrapped NVML enums for cuda.core.system, this PR introduces eleven enum.StrEnum types for the compilation/linking pipeline and the memory APIs:

  • _program.pyx: SourceType, CodeType, CompilerBackend, PchStatus
  • _memory/_virtual_memory_resource.py: VirtualMemoryAllocationType, VirtualMemoryLocationType, VirtualMemoryHandleType, VirtualMemoryGranularityType, VirtualMemoryAccessType
  • _memory/_managed_memory_resource.pyx: ManagedMemoryLocationType
  • graph/_graph_definition.pyx: GraphMemoryType

It uses from enum import StrEnum with a backports.strenum fallback for Python < 3.11 (added to both pyproject.toml and pixi.toml). Argument-side methods accept either the StrEnum member or a plain string; return-side values are still mostly plain strings, matching the migration approach proposed in #1995.


Blocking Issues

These should be addressed in this PR.

1. MemoryLocationType is undefined (typo)

cuda_core/cuda/core/_memory/_managed_memory_resource.pyx:110

def preferred_location(self) -> tuple[MemoryLocationType, int | None] | None:

There is no symbol named MemoryLocationType anywhere in the workspace. The intended name is ManagedMemoryLocationType (introduced in this same PR). Because the file uses from __future__ import annotations, the typo is silent at import time, but typing.get_type_hints(...), IDE inference, and Sphinx autodoc all raise NameError when resolving the annotation.

Fix: rename to ManagedMemoryLocationType (already in scope at line 26).
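The failure mode can be reproduced in miniature (illustrative class names only; the string annotation stands in for the effect of from __future__ import annotations, under which every annotation is stored as a string and undefined names go unnoticed at import time):

```python
import typing

class ManagedMemoryResource:
    # "MemoryLocationType" is undefined; because the annotation is only a
    # string, defining and importing this class raises nothing.
    def preferred_location(self) -> "MemoryLocationType":
        ...

err = None
try:
    # Resolving the annotation (as IDEs and Sphinx autodoc do) fails:
    typing.get_type_hints(ManagedMemoryResource.preferred_location)
except NameError as exc:
    err = exc

print(err)  # name 'MemoryLocationType' is not defined
```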

2. Program.backend / Program.pch_status claim enum return types but return str

cuda_core/cuda/core/_program.pyx:178-180 and :151-175

The properties advertise -> CompilerBackend and -> PchStatus | None, but:

  • _program.pxd declares cdef str _backend and cdef str _pch_status.
  • _backend is set with self._backend = str(CompilerBackend.NVRTC) at lines 684, 694, 728.
  • _pch_status is set with str(status) at line 862, but with the bare enum (PchStatus.FAILED) at line 884.

Net result: program.backend is always a plain str; program.pch_status is sometimes a plain str and sometimes a PchStatus, depending on the code path. Compare with Linker.backend (_linker.pyx:171-174), which now returns a real CompilerBackend member — so two related public properties advertising the same return type behave differently.

Fix: pick one direction and apply it consistently across Program.backend, Program.pch_status, and Linker.backend. Recommended:

  • Change cdef str _backend / cdef str _pch_status in _program.pxd to cdef object.
  • Drop the str(...) wrappers in _program.pyx (lines 684, 694, 728, 862).
  • Keep the doc-stated enum return types.

This makes isinstance(p.backend, CompilerBackend) and isinstance(p.pch_status, PchStatus) actually true. All existing test assertions (program.backend == "NVRTC", etc.) continue to pass via StrEnum equality.

3. New public enums are documented under api_private.rst

cuda_core/docs/source/api_private.rst:24-34

All eleven new enums (CodeType, CompilerBackend, GraphMemoryType, ManagedMemoryLocationType, PchStatus, SourceType, VirtualMemory*Type) are listed in api_private.rst, which carries :orphan: and is described in its own header comment as "for private classes exposed to users". These enums are reachable as cuda.core.CodeType etc., live in cuda.core.__all__, and are named in user-facing docstrings — they are public API.

Fix: move them into api.rst, ideally as a new "Enums" subsection within each owning section:

  • CodeType, CompilerBackend, PchStatus, SourceType under "CUDA compilation toolchain".
  • GraphMemoryType under "CUDA graphs".
  • VirtualMemory*Type and ManagedMemoryLocationType under "Memory management".

Mirror the layout the system module already uses in api.rst:204-229 ("Enums" subsection under "CUDA system information…").


Recommended Polish (this PR or a quick follow-up)

These are smaller items the author may want to roll into the same PR while touching the files.

  • Stale comment, _program.pyx:943: # Keys here should match _utils.enum.CompilerBackend references a module path that does not exist (only _utils/enum_explanations_helpers.py is in the tree). Either move CompilerBackend (and likely CodeType/SourceType) into a shared module to also remove the lazy import in Linker.backend, or drop the comment.
  • Duplicated import shim: the try: from enum import StrEnum except ImportError: from backports.strenum import StrEnum block is copy-pasted in five files. Hoist to a single helper (e.g. cuda.core._utils._strenum) so the shim can be removed in one place when 3.10 support ends.
  • Dead __doc__ mutation, _program.pyx:69: SourceType.CXX.__doc__ = "C++ source code." is fragile (some Enum implementations make members read-only) and inconsistent — only one member of one enum gets a doc string. Drop it and document the "c++" value in the class docstring instead.
  • Tuple → list message change, _program.pyx:732: supported_code_types = [x.value for x in SourceType] changes the user-visible error message from (...) to [...]. The test was patched to a regex that accepts both (test_program.py:330), but it's cleaner to keep the tuple form: tuple(x.value for x in SourceType).
  • Stale assert, _program.pyx:733: assert code_type not in supported_code_types is trivially true at this point and reads as a leftover from earlier code. Remove.
  • Misleading tuple[ManagedMemoryLocationType, ...] annotation (post-fix to issue 1): once the typo is corrected, the property still returns plain strings (because _resolve_preferred_location stores plain "device" etc. into _pref_loc_type). Either weaken the annotation back to tuple[str, int | None] | None or promote _pref_loc_type to actually hold the enum.

Test Gaps

Worth adding while in this PR; cheap and they catch the regression that the blocking-issue fixes are guarding against.

  • assert isinstance(program.backend, CompilerBackend) somewhere in test_program.py. Currently every test only checks string equality, so a future regression that silently re-introduces str(...) would not be caught.
  • Same for program.pch_status returning PchStatus.
  • Parametrized coverage of SourceType.PTX and SourceType.NVVM in Program.__init__ (only SourceType.CXX is exercised in test_launch_invalid_values).
  • ProgramOptions.as_bytes(CompilerBackend.NVRTC) (the function does backend.lower(); works, but not currently tested with enum input).
  • For GraphAllocOptions.memory_type: parametrize _build_alloc_managed_node to also exercise enum input for DEVICE and HOST, plus string input for all three, so both halves of the "either is OK" promise are covered.

Suggested Follow-on PRs

The two items below are direct extensions of this work. They are scoped cleanly and would be easier to review separately than to fold in here.

Follow-on 1: ConditionalNode.cond_type enum

Background. Issue #1995 explicitly lists ConditionalNode.cond_type (returning "if", "while", "switch") as a string-vocabulary candidate for an enum, alongside everything PR 2016 already addresses. The current implementation lives at cuda_core/cuda/core/graph/_subclasses.pyx:701 and still returns plain str. It is not touched by PR 2016. Since #1995 is P0 on the v1.0.0 milestone (due 2026-05-07), closing this gap before 1.0.0 keeps the issue closable.

Suggested follow-on PR.

  • Add a ConditionalType(StrEnum) next to the other graph enums in cuda.core.graph._graph_definition (or a new _conditional.pyx if you prefer to keep _subclasses.pyx lean).
  • Members: IF = "if", WHILE = "while", SWITCH = "switch". ("if" and "while" are Python keywords, so they cannot be used directly as member names; the uppercase NAME = "value" pattern matches what SourceType.CXX = "c++" already does in this PR.)
  • Update _subclasses.pyx to return ConditionalType instead of str from ConditionalNode.cond_type.
  • Add it to cuda.core.graph.__all__ and re-export from cuda.core.
  • Add an isinstance test mirroring whatever pattern the team picks for Program.backend in this PR.

This follow-on is small (~30 lines + tests) and directly closes the remaining #1995 surface.

Follow-on 2: bindings-vs-core enum sync test

Background. PR 2014's review surfaced a structural risk that wrappers in cuda_core can drift from auto-generated cuda_bindings enums when NVML adds new values. The flip side applies to PR 2016: most of the new StrEnums here wrap driver-level C enums (CUmemAllocationType, CUmemAllocationHandleType, CUmemLocationType, CUmemAllocationGranularity_flags, CUmemAccess_flags, nvrtcResult, …) selectively, not exhaustively. When the CUDA driver introduces a new value (e.g. a future CU_MEM_HANDLE_TYPE_*), the wrapper StrEnum will silently fall behind and a user calling, say, VirtualMemoryResourceOptions(handle_type=...) with the new value would have to bypass the enum to use it — defeating the type-safety goal.

This is not a blocker for v1.0.0 (today's enums match today's bindings), but it is the kind of debt that compounds across CTK releases.

Suggested follow-on PR. Add a parametrized sync test (e.g. cuda_core/tests/test_enum_sync.py) that imports the relevant cuda.bindings.driver / cuda.bindings.nvrtc enums and asserts the wrapper StrEnum value-set is a known subset, with a small allow-list for intentional omissions:

@pytest.mark.parametrize("wrapper, source, omitted", [
    (VirtualMemoryAllocationType, cydriver.CUmemAllocationType,
     {"CU_MEM_ALLOCATION_TYPE_INVALID", "CU_MEM_ALLOCATION_TYPE_MAX"}),
    (VirtualMemoryHandleType, cydriver.CUmemAllocationHandleType,
     {"CU_MEM_HANDLE_TYPE_NONE", "CU_MEM_HANDLE_TYPE_MAX"}),
    (VirtualMemoryLocationType, cydriver.CUmemLocationType,
     {"CU_MEM_LOCATION_TYPE_INVALID", "CU_MEM_LOCATION_TYPE_NONE",
      "CU_MEM_LOCATION_TYPE_MAX"}),
    # ...etc.
])
def test_strenum_covers_bindings(wrapper, source, omitted):
    bindings_values = {m.name for m in source} - omitted
    wrapper_values = {m.name.upper() for m in wrapper}  # or a normalization fn
    missing = bindings_values - wrapper_values
    assert not missing, f"{wrapper.__name__} is missing: {missing}"

The test fails on the cuda-bindings update PR that adds a new driver value, prompting an explicit decision (extend the wrapper, or extend the omitted set with a comment). This is exactly the kind of thing that would have caught the gap quietly during the 12.x → 13.x bindings refresh.

A version of this test should also be applied to the NVML wrappers introduced in PR 2014, so the sync guard covers both halves of the v1.0.0 enum work.


Notes on Items NOT Addressed in This Review

  • MemoryType unification from cuda.core: Replace string literals with enums in public API #1995. Issue 1995 proposed a single shared MemoryType enum for device/host/managed. PR 2016 splits it into three (GraphMemoryType, ManagedMemoryLocationType, VirtualMemoryLocationType) — reasonable for documentation provenance, but worth a one-line note in the PR description so reviewers don't expect the unified form. (Could also be revisited in a future PR by deriving them from a shared base or making them aliases.)

@mdboom
Contributor Author

mdboom commented May 4, 2026

  1. New public enums are documented under api_private.rst
    cuda_core/docs/source/api_private.rst:24-34

All eleven new enums (CodeType, CompilerBackend, GraphMemoryType, ManagedMemoryLocationType, PchStatus, SourceType, VirtualMemory*Type) are listed in api_private.rst, which carries :orphan: and is described in its own header comment as "for private classes exposed to users". These enums are reachable as cuda.core.CodeType etc., live in cuda.core.__all__, and are named in user-facing docstrings — they are public API.

This is to address @leofang's concern offline about "class explosion" from using enums. By documenting them as linked from functions that use the enums, and not putting them at the top-level table of contents, I think this helps quite a bit with that concern. If anything the api_private.rst (which is just a hack to document things not in the table of contents) is now mis-named and maybe we address that separately.

Duplicated import shim: the try: from enum import StrEnum except ImportError: from backports.strenum import StrEnum block is copy-pasted in five files. Hoist to a single helper (e.g. cuda.core._utils._strenum) so the shim can be removed in one place when 3.10 support ends.

Let's separate that out. Not required for 1.0

#2019

Dead __doc__ mutation, _program.pyx:69: SourceType.CXX.__doc__ = "C++ source code." is fragile (some Enum implementations make members read-only) and inconsistent — only one member of one enum gets a doc string. Drop it and document the "c++" value in the class docstring instead.

That suggestion does not work with our current Sphinx setup, and it is also less ergonomic at the terminal or in Jupyter.

Sync tests

As with #2014, I'll experiment with what sync testing would look like, but it is not critical for 1.0

Contributor

@rwgk rwgk left a comment


Only two tiny fixes. The rest looks good to me.

Comment thread cuda_core/cuda/core/_program.pyx Outdated
Comment thread cuda_core/cuda/core/_memory/_managed_memory_resource.pyx Outdated
Comment thread cuda_core/cuda/core/graph/_conditional_type.py Outdated
Comment thread cuda_core/cuda/core/_module.pyx Outdated
Comment on lines +570 to +573
CodeTypeT = bytes | bytearray | str

-cdef tuple _supported_code_type = ("cubin", "ptx", "ltoir", "fatbin", "object", "library")
+cdef tuple _supported_code_type = tuple(CodeType.__members__.values())
Member


Ah, when reading this side-by-side it is easy to catch: We have both CodeTypeT and CodeType meaning two very different things, we probably should fix this

Contributor Author


Would CodeFormat (for the enum with "cubin", "ptx" etc.) make sense?

Member


Maybe ObjectCodeFormatT? The ObjectCode class can be constructed from any supported format.

Another naming convention discussion: Our type annotations have been made a bit C++-ish (but consistently) by appending a suffix T: IsStreamT, CodeTypeT, .... Maybe a good time to decide a convention and stick to it. My only concern (as raised elsewhere) is we might have both actual classes and associated type annotations refer to overlapping or orthogonal concepts. Would be nice to avoid ambiguity.

Contributor Author


I think in this case, enums are not type annotations (they have members and runtime semantics and such), so they shouldn't have the T. I know we are putting them in typing, though.

Comment thread cuda_core/cuda/core/_program.pyx Outdated
Comment thread cuda_core/cuda/core/__init__.py Outdated
Comment thread cuda_core/cuda/core/__init__.py Outdated
@leofang leofang added the enhancement (Any code-related improvements) label May 5, 2026
@mdboom mdboom requested a review from leofang May 5, 2026 14:45
@mdboom
Contributor Author

mdboom commented May 5, 2026

This got a pretty significant rework based on @leofang's feedback. All the new enums are defined in cuda.core.typing now. The suggested renames of the enums were applied. And this ties into the testing added in #2014 that ensures that new enum values added upstream will be detected. A few bugfixes to that new testing infrastructure were required to make this work, which then revealed a couple of small bugs in #2014, fixed here.

Comment thread cuda_core/tests/graph/test_graph_definition.py
Comment thread cuda_core/tests/test_launcher.py
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py Outdated
Comment thread cuda_core/cuda/core/typing.py
Member

@leofang leofang left a comment


LGTM overall! Left a bunch of minor comments mainly for consistency.

@leofang leofang added the P0 (High priority - Must do!) and breaking (Breaking changes are introduced) labels and removed the P1 (Medium priority - Should do) label May 5, 2026
@mdboom
Contributor Author

mdboom commented May 5, 2026

LGTM overall! Left a bunch of minor comments mainly for consistency.

Ok. I renamed everything so all of these type-annotation-like things have a consistent Type suffix.

@mdboom mdboom requested a review from leofang May 5, 2026 15:54
Member

@leofang leofang left a comment


We will need a release note entry, feel free to add it in this PR or the next one. Approving.

@mdboom mdboom enabled auto-merge (squash) May 5, 2026 16:24
@mdboom mdboom merged commit cec0efb into NVIDIA:main May 5, 2026
94 checks passed
@github-actions

github-actions Bot commented May 5, 2026

Doc Preview CI
Preview removed because the pull request was closed or merged.

